AITopics | image location

Implicitly Constrained Gaussian Process Regression for Monocular Non-Rigid Pose Estimation

Neural Information Processing SystemsMar-15-2024, 16:02:39 GMT

Estimating 3D pose from monocular images is a highly ambiguous problem. Physical constraints can be exploited to restrict the space of feasible configurations. In this paper we propose an approach to constraining the prediction of a discriminative predictor. We first show that the mean prediction of a Gaussian process implicitly satisfies linear constraints if those constraints are satisfied by the training examples. We then show how, by performing a change of variables, a GP can be forced to satisfy quadratic constraints. As evidenced by the experiments, our method outperforms state-of-the-art approaches on the tasks of rigid and non-rigid pose estimation.

constraint, prediction, training example, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(8 more...)

Genre:

Research Report (0.48)
Overview (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Add feedback

ProtoP-OD: Explainable Object Detection with Prototypical Parts

Rath-Manakidis, Pavlos, Strothmann, Frederik, Glasmachers, Tobias, Wiskott, Laurenz

arXiv.org Artificial IntelligenceFeb-29-2024

Interpretation and visualization of the behavior of detection transformers tends to highlight the locations in the image that the model attends to, but it provides limited insight into the \emph{semantics} that the model is focusing on. This paper introduces an extension to detection transformers that constructs prototypical local features and uses them in object detection. These custom features, which we call prototypical parts, are designed to be mutually exclusive and align with the classifications of the model. The proposed extension consists of a bottleneck module, the prototype neck, that computes a discretized representation of prototype activations and a new loss term that matches prototypes to object classes. This setup leads to interpretable representations in the prototype neck, allowing visual inspection of the image content perceived by the model and a better understanding of the model's reliability. We show experimentally that our method incurs only a limited performance penalty, and we provide examples that demonstrate the quality of the explanations provided by our method, which we argue outweighs the performance penalty.

activation, detection, prototype, (15 more...)

arXiv.org Artificial Intelligence

2402.19142

Country: Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound

Neural Information Processing SystemsApr-6-2023, 12:46:08 GMT

In this work we use Branch-and-Bound (BB) to efficiently detect objects with deformable part models. Instead of evaluating the classifier score exhaustively over image locations and scales, we use BB to focus on promising image locations. The core problem is to compute bounds that accommodate part deformations; for this we adapt the Dual Trees data structure to our problem. We evaluate our approach using Mixture-of-Deformable Part Models. We obtain exactly the same results but are 10-20 times faster on average.

dual-tree branch-and-bound, image location, rapid deformable object detection, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.67)
Information Technology > Artificial Intelligence > Vision (0.47)

Add feedback

Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound

Kokkinos, Iasonas

Neural Information Processing SystemsFeb-15-2020, 00:12:18 GMT

In this work we use Branch-and-Bound (BB) to efficiently detect objects with deformable part models. Instead of evaluating the classifier score exhaustively over image locations and scales, we use BB to focus on promising image locations. The core problem is to compute bounds that accommodate part deformations; for this we adapt the Dual Trees data structure to our problem. We evaluate our approach using Mixture-of-Deformable Part Models. We obtain exactly the same results but are 10-20 times faster on average.

dual-tree branch-and-bound, image location, rapid deformable object detection, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.67)
Information Technology > Artificial Intelligence > Vision (0.47)

Add feedback

Plan-Recognition-Driven Attention Modeling for Visual Recognition

Zha, Yantian, Li, Yikang, Yu, Tianshu, Kambhampati, Subbarao, Li, Baoxin

arXiv.org Artificial IntelligenceDec-1-2018

Human visual recognition of activities or external agents involves an interplay between high-level plan recognition and low-level perception. Given that, a natural question to ask is: can low-level perception be improved by high-level plan recognition? We formulate the problem of leveraging recognized plans to generate better top-down attention maps \cite{gazzaniga2009,baluch2011} to improve the perception performance. We call these top-down attention maps specifically as plan-recognition-driven attention maps. To address this problem, we introduce the Pixel Dynamics Network. Pixel Dynamics Network serves as an observation model, which predicts next states of object points at each pixel location given observation of pixels and pixel-level action feature. This is like internally learning a pixel-level dynamics model. Pixel Dynamics Network is a kind of Convolutional Neural Network (ConvNet), with specially-designed architecture. Therefore, Pixel Dynamics Network could take the advantage of parallel computation of ConvNets, while learning the pixel-level dynamics model. We further prove the equivalence between Pixel Dynamics Network as an observation model, and the belief update in partially observable Markov decision process (POMDP) framework. We evaluate our Pixel Dynamics Network in event recognition tasks. We build an event recognition system, ER-PRN, which takes Pixel Dynamics Network as a subroutine, to recognize events based on observations augmented by plan-recognition-driven attention.

artificial intelligence, machine learning, sequence, (15 more...)

arXiv.org Artificial Intelligence

1812.00301

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Learning what to look in chest X-rays with a recurrent visual attention model

Ypsilantis, Petros-Pavlos, Montana, Giovanni

arXiv.org Machine LearningJan-23-2017

X-rays are commonly performed imaging tests that use small amounts of radiation to produce pictures of the organs, tissues, and bones of the body. X-rays of the chest are used to detect abnormalities or diseases of the airways, blood vessels, bones, heart, and lungs. In this work we present a stochastic attention-based model that is capable of learning what regions within a chest X-ray scan should be visually explored in order to conclude that the scan contains a specific radiological abnormality. The proposed model is a recurrent neural network (RNN) that learns to sequentially sample the entire X-ray and focus only on informative areas that are likely to contain the relevant information. We report on experiments carried out with more than $100,000$ X-rays containing enlarged hearts or medical devices. The model has been trained using reinforcement learning methods to learn task-specific policies.

artificial intelligence, machine learning, medical device, (16 more...)

arXiv.org Machine Learning

1701.06452

Country:

Europe > United Kingdom (0.15)
Europe > Spain (0.14)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Nuclear Medicine (0.93)
Health & Medicine > Therapeutic Area (0.72)
Health & Medicine > Diagnostic Medicine > Imaging (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback

Rapid Deformable Object Detection using Dual-Tree Branch-and-Bound

Kokkinos, Iasonas

Neural Information Processing SystemsDec-31-2011

In this work we use Branch-and-Bound (BB) to efficiently detect objects with deformable part models. Instead of evaluating the classifier score exhaustively over image locations and scales, we use BB to focus on promising image locations. The core problem is to compute bounds that accommodate part deformations; for this we adapt the Dual Trees data structure to our problem. We evaluate our approach using Mixture-of-Deformable Part Models. We obtain exactly the same results but are 10-20 times faster on average. We also develop a multiple-object detection variation of the system, where hypotheses for 20 categories are inserted in a common priority queue. For the problem of finding the strongest category in an image this results in up to a 100-fold speedup.

artificial intelligence, contribution, detection, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.87)

Add feedback

Implicitly Constrained Gaussian Process Regression for Monocular Non-Rigid Pose Estimation

Salzmann, Mathieu, Urtasun, Raquel

Neural Information Processing SystemsDec-31-2010

Estimating 3D pose from monocular images is a highly ambiguous problem. Physical constraints can be exploited to restrict the space of feasible configurations. In this paper we propose an approach to constraining the prediction of a discriminative predictor. We first show that the mean prediction of a Gaussian process implicitly satisfies linear constraints if those constraints are satisfied by the training examples. We then show how, by performing a change of variables, a GP can be forced to satisfy quadratic constraints. As evidenced by the experiments, our method outperforms state-of-the-art approaches on the tasks of rigid and non-rigid pose estimation.

artificial intelligence, constraint, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.68)
North America > United States > California (0.46)

Genre:

Research Report (0.48)
Overview (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)

Add feedback

A Biologically Plausible Model for Rapid Natural Scene Identification

Ghebreab, Sennay, Scholte, Steven, Lamme, Victor, Smeulders, Arnold

Neural Information Processing SystemsDec-31-2009

Contrast statistics of the majority of natural images conform to a Weibull distribution. This property of natural images may facilitate efficient and very rapid extraction of a scenes visual gist. Here we investigate whether a neural response model based on the Weibull contrast distribution captures visual information that humans use to rapidly identify natural scenes. In a learning phase, we measure EEG activity of 32 subjects viewing brief flashes of 800 natural scenes. From these neural measurements and the contrast statistics of the natural image stimuli, we derive an across subject Weibull response model. We use this model to predict the responses to a large set of new scenes and estimate which scene the subject viewed by finding the best match between the model predictions and the observed EEG responses. In almost 90 percent of the cases our model accurately predicts the observed scene. Moreover, in most failed cases, the scene mistaken for the observed scene is visually similar to the observed scene itself. These results suggest that Weibull contrast statistics of natural images contain a considerable amount of scene gist information to warrant rapid identification of natural images.

artificial intelligence, machine learning, natural image, (16 more...)

Neural Information Processing Systems

Country: Europe > Netherlands (0.15)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: